Automatic Categorization of Google Search Results for Medical Queries using JDI

Authors

  • Anantha K. Bangalore
  • Guy Divita
  • Susanne Humphrey
  • Allen Browne
  • Karen E. Thorn
Abstract

The web has become the primary source of medical information for consumers and health professionals, and it is common for people to “Google” a medical topic. As the number of documents on the web grows, however, it becomes harder to locate the best documents quickly. Classifying results into meaningful categories helps guide users to the most relevant set of results. Journal Descriptor Indexing (JDI) is a novel approach to fully automatic indexing. In this paper we explore the feasibility of using JDI to organize Google search results for medical queries into meaningful categories. For our experiments, we used JDI in combination with a set of heuristics to automatically categorize the search results for 5 query terms. Three independent reviewers evaluated the automatic categorization of 3 documents for each query term. The results suggest that this method offers promise. Additional work on improving the categorization, as well as on determining whether a term is medical or not, is also discussed.

Similar resources

Comparing a rule-based versus statistical system for automatic categorization of MEDLINE documents according to biomedical specialty

Automatic document categorization is an important research problem in Information Science and Natural Language Processing. Many applications, including Word Sense Disambiguation and Information Retrieval in large collections, can benefit from such categorization. This paper focuses on automatic categorization of documents from the biomedical literature into broad discipline-based categories. Tw...

'surfing for knowledge' finding semantically similar Web clusters

In this paper we present our technique for finding semantically similar clusters within web documents obtained from a set of queries retrieved from the Google search engine. This technique utilizes a clustering algorithm based on previous Latent Semantic Analysis (LSA) work pioneered by Deerwester. In this paper we demonstrate how by using our clustering algorithm we can resolve ambiguities pre...

Automatic Acquisition of Synonyms Using the Web as a Corpus

We present an original algorithm for automatic acquisition of synonyms from text. The algorithm measures the semantic similarity between pairs of words by comparing their local contexts, extracted from the Web by a series of queries against the Google search engine. The results show an 11pt average precision of 63.16%.

Real time search on the web: Queries, topics, and economic value

Real time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real time search engine over a 190 day period. Using query log analysis, we investigate searching behavior, categorize search topics, and measure the economic value of this real time search stream. We examine aggregate usage of the search engine, ...

Analysis of users’ query reformulation behavior in Web with regard to Wholistic/analytic cognitive styles, Web experience, and search task type

Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is applied research using the survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...

Publication date: 2008